
    Adding New Tasks to a Single Network with Weight Transformations using Binary Masks

    Visual recognition algorithms are required today to exhibit adaptive abilities. Given a deep model trained on a specific task, it would be highly desirable to adapt it incrementally to new tasks, preserving scalability as the number of tasks increases while avoiding catastrophic forgetting. Recent work has shown that masking the internal weights of a given original conv-net through learned binary variables is a promising strategy. We build upon this intuition and consider more elaborate affine transformations of the convolutional weights that include learned binary masks. We show that with our generalization it is possible to achieve significantly higher levels of adaptation to new tasks, enabling the approach to compete with fine-tuning strategies while requiring slightly more than 1 bit per network parameter per additional task. Experiments on two popular benchmarks showcase the power of our approach, which achieves the new state of the art on the Visual Decathlon Challenge.
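    A minimal sketch of the per-task weight transformation described above, assuming a scalar affine combination (k0, k1, k2) of the frozen base weights with a learned binary mask obtained by a straight-through threshold; these parameter names and the exact form are illustrative, not the paper's definition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedAffineConv2d(nn.Module):
    """Wraps a frozen, shared conv layer with a per-task mask and affine coefficients."""

    def __init__(self, base_conv: nn.Conv2d):
        super().__init__()
        self.base = base_conv
        for p in self.base.parameters():          # the original weights stay fixed
            p.requires_grad_(False)
        # Real-valued mask scores, thresholded to ~1 bit per weight per task.
        self.mask_scores = nn.Parameter(torch.zeros_like(base_conv.weight))
        # A handful of scalar affine coefficients learned per task.
        self.k0 = nn.Parameter(torch.ones(1))
        self.k1 = nn.Parameter(torch.zeros(1))
        self.k2 = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # Straight-through estimator: hard 0/1 mask forward, identity gradient backward.
        hard = (self.mask_scores > 0).float()
        m = hard + self.mask_scores - self.mask_scores.detach()
        w = self.base.weight
        w_task = self.k0 * w + self.k1 * w * m + self.k2 * m   # affine transform of the weights
        return F.conv2d(x, w_task, self.base.bias,
                        stride=self.base.stride, padding=self.base.padding,
                        dilation=self.base.dilation, groups=self.base.groups)
```

    At test time only the binarized mask and the few scalars need to be stored per task, which is where the roughly 1-bit-per-parameter overhead mentioned in the abstract comes from.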

    Deep Shape Matching

    We cast shape matching as metric learning with convolutional networks. We break the end-to-end process of image representation into two parts. First, well-established, efficient methods are chosen to turn the images into edge maps. Second, the network is trained with edge maps of landmark images, which are obtained automatically by a structure-from-motion pipeline. The learned representation is evaluated on a range of different tasks, providing improvements on challenging cases of domain generalization, generic sketch-based image retrieval, and its fine-grained counterpart. In contrast to other methods that learn a different model per task, object category, or domain, we use the same network throughout all our experiments, achieving state-of-the-art results on multiple benchmarks.
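    A minimal sketch of the two-stage pipeline above: images are first turned into edge maps, then a network is trained on those edge maps with a metric-learning objective. The Canny extractor, the toy embedding network, and the contrastive loss below are illustrative stand-ins, not necessarily the authors' choices.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def to_edge_map(image_bgr: np.ndarray) -> torch.Tensor:
    """Stage 1: reduce an image to a single-channel edge map (stand-in extractor)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    return torch.from_numpy(edges).float().div_(255.0).unsqueeze(0)   # shape (1, H, W)

class EdgeEmbedder(nn.Module):
    """Stage 2: embed edge maps into a metric space with a small CNN."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)   # unit-length descriptor

def contrastive_loss(a, b, same, margin: float = 0.7):
    """Pulls matching edge-map pairs together, pushes non-matching pairs apart."""
    d = (a - b).pow(2).sum(-1).sqrt()
    return (same * d.pow(2) + (1 - same) * (margin - d).clamp(min=0).pow(2)).mean()
```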

    Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework

    In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities in a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian algorithm with different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of whether different objects co-occur in the training set. We validate the performance of our approach on standard single- and multi-object datasets, showing state-of-the-art performance on every dataset. (ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, 2-6 December 2018.)
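    A minimal sketch of the query-to-object alignment step mentioned above, assuming a cosine cost between query embeddings and detected-object embeddings solved with the Hungarian algorithm; the cost function and the returned matched cost are illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_queries_to_objects(query_emb: np.ndarray, object_emb: np.ndarray):
    """query_emb: (Q, D) sketch/text embeddings; object_emb: (O, D) salient-object embeddings."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    o = object_emb / np.linalg.norm(object_emb, axis=1, keepdims=True)
    cost = 1.0 - q @ o.T                         # cosine distance, shape (Q, O)
    rows, cols = linear_sum_assignment(cost)     # Hungarian matching
    matches = list(zip(rows.tolist(), cols.tolist()))
    return matches, float(cost[rows, cols].mean())   # matched pairs + mean matched cost
```

    A loss defined over the matched pairs (for example, the mean matched cost) can then supervise both the embedding alignment and the attention, along the lines the abstract outlines.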

    Inner Space Preserving Generative Pose Machine

    Image-based generative methods, such as generative adversarial networks (GANs), are already able to generate realistic images with substantial context control, especially when they are conditioned. However, most successful frameworks share a common procedure: they perform an image-to-image translation that leaves the pose of figures in the image untouched. When the objective is reposing a figure in an image while preserving the rest of the image, the state of the art mainly assumes a single rigid body with a simple background and limited pose shift, which can hardly be extended to images under normal settings. In this paper, we introduce an image "inner space" preserving model that assigns an interpretable low-dimensional pose descriptor (LDPD) to an articulated figure in the image. Figure reposing is then generated by passing the LDPD and the original image through multi-stage augmented hourglass networks in a conditional GAN structure, called the inner space preserving generative pose machine (ISP-GPM). We evaluated ISP-GPM on reposing human figures, which are highly articulated with versatile variations. Testing a state-of-the-art pose estimator on our reposed dataset gave an accuracy of over 80% on the PCK0.5 metric. The results also show that ISP-GPM preserves the background with high accuracy while reasonably recovering the area blocked by the figure to be reposed. Comment: http://www.northeastern.edu/ostadabbas/2018/07/23/inner-space-preserving-generative-pose-machine
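    A minimal sketch of the conditioning interface described above: the original image and a low-dimensional pose descriptor are fed jointly to a generator by broadcasting the descriptor to spatial maps and concatenating it with the image channels. The tiny network below stands in for the multi-stage augmented hourglass generator; the class name and descriptor dimensionality are assumptions.

```python
import torch
import torch.nn as nn

class PoseConditionedGenerator(nn.Module):
    """Generates a reposed image from (original image, low-dimensional pose descriptor)."""

    def __init__(self, pose_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + pose_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, image: torch.Tensor, ldpd: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); ldpd: (B, pose_dim) broadcast to (B, pose_dim, H, W)
        b, _, h, w = image.shape
        pose_maps = ldpd[:, :, None, None].expand(b, ldpd.shape[1], h, w)
        return self.net(torch.cat([image, pose_maps], dim=1))
```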

    Free-hand sketch synthesis with deformable stroke models

    We present a generative model which can automatically summarize the stroke composition of free-hand sketches of a given category. When our model is fit to a collection of sketches with similar poses, it discovers and learns the structure and appearance of a set of coherent parts, with each part represented by a group of strokes. It captures both the consistent (topology) and the diverse (structure and appearance variations) aspects of each sketch category. Key to the success of our model are important insights learned from a comprehensive study performed on human stroke data. By fitting this model to images, we are able to synthesize visually similar and pleasing free-hand sketches.
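    A minimal illustration of the part-discovery idea above, assuming each sketch is available as a list of stroke polylines: strokes pooled from similarly posed sketches are described by simple geometric features and grouped into candidate parts. The features and the k-means grouping are stand-ins for the paper's deformable stroke model fitting.

```python
import numpy as np
from sklearn.cluster import KMeans

def stroke_features(stroke: np.ndarray) -> np.ndarray:
    """stroke: (P, 2) polyline of x/y points -> a small geometric descriptor."""
    centroid = stroke.mean(axis=0)
    length = np.linalg.norm(np.diff(stroke, axis=0), axis=1).sum()
    extent = stroke.max(axis=0) - stroke.min(axis=0)
    return np.concatenate([centroid, [length], extent])

def group_strokes_into_parts(strokes, n_parts: int = 6) -> np.ndarray:
    """strokes: list of (P_i, 2) arrays pooled from aligned sketches of one category."""
    feats = np.stack([stroke_features(s) for s in strokes])
    labels = KMeans(n_clusters=n_parts, n_init=10).fit_predict(feats)
    return labels   # strokes sharing a label form one candidate part
```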

    Transferring Neural Representations for Low-dimensional Indexing of Maya Hieroglyphic Art

    We analyze the performance of deep neural architectures for extracting shape representations of binary images and for generating low-dimensional representations of them. In particular, we focus on indexing binary images exhibiting compounds of Maya hieroglyphic signs, referred to as glyph-blocks, which constitute a very challenging artistic dataset given their visual complexity and large stylistic variety. More precisely, we demonstrate empirically that intermediate outputs of convolutional neural networks can be used as representations for complex shapes, even when their parameters are trained on gray-scale images, and that these representations can be more robust than traditional handcrafted features. We also show that it is possible to compress such representations down to only three dimensions without harming much of their discriminative structure, such that effective visualizations of Maya hieroglyphs can be rendered for subsequent epigraphic analysis.
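    A minimal sketch of the indexing pipeline above: an intermediate convolutional output of a pretrained network serves as the shape descriptor for a binary glyph-block image, and the descriptors are then compressed to three dimensions for visualization. The choice of torchvision's VGG16 features and of PCA as the compressor are illustrative assumptions.

```python
import numpy as np
import torch
import torchvision.models as models
from sklearn.decomposition import PCA

# Pretrained backbone used only as a fixed feature extractor (illustrative choice).
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

@torch.no_grad()
def glyph_descriptor(binary_image: torch.Tensor) -> torch.Tensor:
    """binary_image: (1, 1, H, W) in [0, 1]; replicated to 3 channels for the network."""
    x = binary_image.repeat(1, 3, 1, 1)
    fmap = backbone(x)                        # intermediate conv output, (1, 512, h, w)
    return fmap.mean(dim=(2, 3)).squeeze(0)   # global-average-pooled shape descriptor

def compress_to_3d(descriptors: np.ndarray) -> np.ndarray:
    """descriptors: (N, 512) -> (N, 3) for plotting and epigraphic browsing."""
    return PCA(n_components=3).fit_transform(descriptors)
```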

    Identifying core MRI sequences for reliable automatic brain metastasis segmentation

    BACKGROUND: Many automatic approaches to brain tumor segmentation employ multiple magnetic resonance imaging (MRI) sequences. The goal of this project was to compare different combinations of input sequences to determine which MRI sequences are needed for effective automated brain metastasis (BM) segmentation. METHODS: We analyzed preoperative imaging (T1-weighted sequence ± contrast enhancement (T1/T1-CE), T2-weighted sequence (T2), and T2 fluid-attenuated inversion recovery (T2-FLAIR) sequence) from 339 patients with BMs from seven centers. A baseline 3D U-Net with all four sequences and six U-Nets with plausible sequence combinations (T1-CE, T1, T2-FLAIR, T1-CE + T2-FLAIR, T1-CE + T1 + T2-FLAIR, T1-CE + T1) were trained on 239 patients from two centers and subsequently tested on an external cohort of 100 patients from five centers. RESULTS: The model based on T1-CE alone achieved the best performance for BM segmentation, with a median Dice similarity coefficient (DSC) of 0.96. Models trained without T1-CE performed worse (T1 only: DSC = 0.70; T2-FLAIR only: DSC = 0.73). For edema segmentation, models that included both T1-CE and T2-FLAIR performed best (DSC = 0.93), while the remaining four models, which did not include both of these sequences simultaneously, reached a median DSC of 0.81-0.89. CONCLUSIONS: A T1-CE-only protocol suffices for the segmentation of BMs. The combination of T1-CE and T2-FLAIR is important for edema segmentation. Missing either T1-CE or T2-FLAIR decreases performance. These findings may improve imaging routines by omitting unnecessary sequences, thus allowing for faster procedures in daily clinical practice while enabling optimal neural-network-based target definitions.
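    A minimal sketch of the evaluation metric reported above: the Dice similarity coefficient (DSC) between a predicted and a reference binary segmentation mask. The smoothing constant is an illustrative choice to avoid division by zero on empty masks.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """pred, target: boolean/0-1 arrays of the same shape (e.g. 3D metastasis masks)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```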

    Fast character modeling with sketch-based PDE surfaces

    Virtual characters are 3D geometric models of characters with many applications in multimedia. In this paper, we propose a new physics-based deformation method and an efficient character modelling framework for the creation of detailed 3D virtual character models. Our physics-based deformation method uses PDE surfaces, where PDE stands for partial differential equation: PDE surfaces are sculpting-force-driven shape representations of interpolation surfaces, which are obtained by interpolating key cross-section profile curves, and the sculpting-force-driven representation uses an analytical solution to a vector-valued partial differential equation involving sculpting forces to obtain deformed shapes quickly. Our character modelling framework consists of global modelling and local modelling. Global modelling, also called model building, creates a whole character model quickly with sketch-guided and template-based modelling techniques. Local modelling produces local details efficiently to improve the realism of the created character model with four shape manipulation techniques. Sketch-guided global modelling generates a character model from three different levels of sketched profile curves, called primary, secondary and key cross-section curves, in three orthographic views. Template-based global modelling obtains a new character model by deforming a template model to match the three levels of profile curves. Four shape manipulation techniques for local modelling are investigated and integrated into the new framework: partial differential equation-based shape manipulation, generalized elliptic curve-driven shape manipulation, sketch-assisted shape manipulation, and template-based shape manipulation. These local modelling techniques provide both global and local shape control and are efficient in local shape manipulation. The final character models are represented with a collection of surfaces modelled with two types of geometric entities: generalized elliptic curves (GECs) and partial differential equation-based surfaces. Our experiments indicate that the proposed modelling approach can build detailed and realistic character models easily and quickly.
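    For reference, a hedged sketch of the kind of vector-valued, sculpting-force-driven PDE from which such surfaces are commonly generated; the exact operator, shape-control parameters, and boundary conditions used in the paper may differ.

```latex
% A commonly used fourth-order form for force-driven PDE surfaces (assumed, not
% taken verbatim from the paper): X(u, v) is the surface point, F(u, v) the
% sculpting force, a and b are shape-control parameters, and the boundary
% conditions come from the interpolated cross-section profile curves.
\[
  \frac{\partial^{4} \mathbf{X}(u,v)}{\partial u^{4}}
  + a \, \frac{\partial^{4} \mathbf{X}(u,v)}{\partial u^{2}\,\partial v^{2}}
  + b \, \frac{\partial^{4} \mathbf{X}(u,v)}{\partial v^{4}}
  = \mathbf{F}(u,v)
\]
```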

    Sketch-a-Net: A Deep Neural Network that Beats Humans

    This project received support from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement #640891, and from the Royal Society and Natural Science Foundation of China (NSFC) Joint Grant #IE141387 and #61511130081. We gratefully acknowledge the support of NVIDIA Corporation for the donation of the GPUs used for this research.